AceReason Collection • Math and Code reasoning model trained through reinforcement learning (RL) • 4 items
Effective Backdoor Mitigation in Vision-Language Models Depends on the Pre-training Objective Paper • 2311.14948 • Published Nov 25, 2023
Countering Language Drift with Seeded Iterated Learning Paper • 2003.12694 • Published Mar 28, 2020 • 1
Recall Traces: Backtracking Models for Efficient Reinforcement Learning Paper • 1804.00379 • Published Apr 2, 2018
Supervised Seeded Iterated Learning for Interactive Language Learning Paper • 2010.02975 • Published Oct 6, 2020
Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment Paper • 2502.00203 • Published Jan 31 • 2
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models Paper • 2504.03624 • Published Apr 4 • 13
FFN Fusion: Rethinking Sequential Computation in Large Language Models Paper • 2503.18908 • Published Mar 24 • 19
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published Mar 18 • 49
V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models Paper • 2502.09980 • Published Feb 14 • 4
SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP Paper • 2408.10202 • Published Aug 19, 2024